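# Packages used below: readr (read_csv), dplyr (%>%), survival (Surv, survfit, coxph), broom (tidy)
library(readr); library(dplyr); library(survival); library(broom)

urlfile <- "https://raw.githubusercontent.com/gthampak/Arm_Guy_MATH150_Project/main/HELPdata.csv"

HELPdata <- read_csv(url(urlfile))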

Introduction

Connecting with medical care is a choice, and many factors can influence it. For non-emergency conditions, a person’s perception of their own health shapes the decision. Basic self-diagnosis is possible with the help of the internet, but the risk of misdiagnosis is high because accurate and useful medical information is not always accessible. People who do not consider their condition critical may see care as unnecessary and decide not to seek it. Beyond personal health perception, perceptions of the health care system also matter, including the ability to pay for treatment. Finally, poor mental health or substance abuse may leave a person in a poor state of mind to decide whether to connect with medical care.

Many studies and surveys have tried to gauge the accuracy of individuals’ perceptions of their own health and their views of healthcare. We widen the scope beyond personal perception of general health to investigate factors that could influence an individual’s views of both their own health and primary healthcare. A better understanding of what influences the decision to connect with medical care can help focus efforts on specific sectors, so that care reaches people who need it but do not know it themselves.

We begin by identifying factors that affect personal health perception, such as age and education. Using a Cox proportional hazards model (coxph), we identify factors that are significant predictors of time to linkage with primary healthcare. Because these factors are associated with primary healthcare linkage, we assume that they also affect the individual’s perception of health. We expect people who perceive themselves as healthy to take longer to link with primary healthcare than people who perceive their health to be poor.

(move from general to specific)

background info (look at source 1)

motivation?

Aim/Hypothesis

Our primary goal was to assess how a person’s perception of their own health and of healthcare affects the effectiveness of a novel multi-disciplinary clinic for linking patients in a residential detoxification program to primary medical care.

Methods

First, we looked at variables that are related to health and health perception and separated them into different categories.

General/ASI-Composite scores

Questions related to opinion on and habits towards healthcare

SF-36 Scores

Drugs-related variables

Drug and Healthcare related variables

Interviewer’s perspective on patient

Demographics and Education-related variables

Model Building

The functions we used for our primary data analyses are coxph(Surv()), to test the significance of the hazard ratio coefficients, and survfit(Surv()), to plot survival curves. First, we put all the variables above into a single model and removed the least significant variable (the one with the highest p-value), one at a time. We also ran models with variables from a single category (above) and again removed the least significant variables. We did this to avoid putting highly correlated variables from the same category into the main model. We also looked for interactions between variables, but none were significant.
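The elimination procedure described above can be sketched as follows. This is a minimal illustration, not the exact code used for the analysis: it uses the lung dataset that ships with the survival package instead of HELPdata, the helper name backward_coxph is hypothetical, and the 0.05 stopping threshold is an assumption (the text does not state one).

```r
library(survival)

# Stand-in data (assumption): `lung` ships with the survival package.
# Only numeric covariates are used, so each term has exactly one coefficient.
dat <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog", "wt.loss")])

# Hypothetical helper: refit the Cox model, drop the term with the largest
# p-value, and stop once every remaining term has p <= alpha.
backward_coxph <- function(data, terms, alpha = 0.05) {
  repeat {
    f <- as.formula(paste("Surv(time, status) ~", paste(terms, collapse = " + ")))
    fit <- coxph(f, data = data)
    coefs <- summary(fit)$coefficients
    p <- setNames(coefs[, "Pr(>|z|)"], rownames(coefs))
    if (length(terms) == 1 || max(p) <= alpha) return(fit)
    terms <- setdiff(terms, names(p)[which.max(p)])
  }
}

final <- backward_coxph(dat, c("age", "sex", "ph.ecog", "wt.loss"))
print(final)
```

Dropping one variable per refit matters because p-values change each time a correlated variable leaves the model.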

(make sure replicable)

Explanation of variables we explored, why we explored them (in relation to our main hypothesis/primary aim)

Categorize variables explored (sf scores??)

Results (Model goes here!)

Exploratory Data Analyses

First, we wanted to see whether a higher pain score translates to a lower self-rating of health. We hypothesized that it does.

From this visualization, we see that as pain score increases, the number of people who rated their health as excellent increases, which makes sense.

Next, we were interested in the relationship between age and perception of future health. We hypothesized that older people are more likely to believe that their health will get worse. This is mostly true: the percentage of people who believe their health will get worse increases with age, as shown below.

Next, we looked for patterns between perception of mental health and general health. This visualisation is an effort to answer the question: do individuals incorporate their mental health into how they rate their general health? The majority of people who think their general health is bad also suffer from poor mental health, which suggests that mental health is considered in ratings of overall general health.

Now, we investigate variables associated with education, as education may affect an individual’s perception of health care, their knowledge of the health care system, and their ability to self-diagnose. We plot a quick histogram to check whether years of education (a9) corresponds with the high school variable (hs_grad), and it does: high school finishes after 12 years of formal education, and the data reflect this.

Here, we note that many of the patients in the study are high school graduates.

T-tests

With that, we wanted to know whether high school graduates and non-high school graduates view the importance of medical treatment differently. We performed a t-test of d5_rec, which asks patients whether medical treatment is important (0=No, 1=Yes), grouped by hs_grad. We get a p-value of 0.7741, so we cannot reject the null hypothesis that the proportion of patients who think medical treatment is important is the same for high school graduates and non-graduates.

## 
##  Welch Two Sample t-test
## 
## data:  d5_rec by hs_grad
## t = 0.28744, df = 198.75, p-value = 0.7741
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.09854964  0.13218183
## sample estimates:
## mean in group 0 mean in group 1 
##       0.5523810       0.5355649
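Because d5_rec is coded 0/1, the Welch t-test above is effectively comparing two proportions. The sketch below rebuilds the responses from counts back-calculated from the reported group means (58 of 105 and 128 of 239; these counts are inferred, not read from the data) and compares the Welch t-test with prop.test(), the two-sample test for proportions in base R.

```r
# Counts back-calculated from the reported group means (an assumption):
# 58/105 = 0.5523810 (hs_grad = 0) and 128/239 = 0.5355649 (hs_grad = 1).
yes <- c(58, 128)
n   <- c(105, 239)

# Rebuild the 0/1 responses so the Welch t-test can be rerun.
d5_rec  <- c(rep(1:0, c(yes[1], n[1] - yes[1])),
             rep(1:0, c(yes[2], n[2] - yes[2])))
hs_grad <- rep(0:1, n)

welch <- t.test(d5_rec ~ hs_grad)

# Two-sample test of equal proportions on the same counts.
prop <- prop.test(yes, n)

round(c(welch = welch$p.value, prop = prop$p.value), 4)
```

If the inferred counts are right, the Welch test reproduces the reported t statistic and p-value, and both tests lead to the same conclusion.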

Another factor we thought could affect patients’ views on the importance of medical treatment is their primary substance. We used a t-test to see whether the proportion of people who view medical treatment as important differs between patients who have alcohol as their primary substance and those who do not. The p-value for this t-test is 0.03, which is significant at the 0.05 level, so we reject the null hypothesis. The test shows that patients with alcohol as their primary substance are more likely to view medical treatment as important (0.586 versus 0.464).

## 
##  Welch Two Sample t-test
## 
## data:  d5_rec by alcohol
## t = -2.1824, df = 254.19, p-value = 0.02999
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.23130020 -0.01187097
## sample estimates:
## mean in group 0 mean in group 1 
##       0.4640000       0.5855856

Significant variables after preliminary models: group, alcohol, coc_her, hs_grad, a9 (education), pf, any_util

Survival Analysis Model

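# Kaplan-Meier curves; survfit() fits one curve per distinct combination of the covariates (including each distinct age)
HELPdata_survfit <- survfit(Surv(dayslink, linkstatus) ~ group + alcohol + hs_grad + any_util + age, data=HELPdata)

# ggsurvplot() is from the survminer package
#ggsurvplot(HELPdata_survfit, conf.int = TRUE)

# Cox proportional hazards model for time to primary care linkage; tidy() returns the coefficient table
coxph(Surv(dayslink, linkstatus) ~ group + alcohol + hs_grad + any_util + age, data=HELPdata) %>% tidy()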
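Because survfit() draws one curve per distinct covariate combination, survival curves are usually easier to read when stratified by a single grouping variable. A minimal sketch, using the survival package’s lung data and its sex variable as stand-ins for HELPdata and group (both substitutions are assumptions for illustration):

```r
library(survival)

# Stand-in data (assumption): `lung` ships with the survival package;
# sex plays the role that group plays in HELPdata.
km <- survfit(Surv(time, status) ~ sex, data = lung)

# One Kaplan-Meier curve per level of the grouping variable.
plot(km, col = c("steelblue", "firebrick"), conf.int = TRUE,
     xlab = "Days", ylab = "Proportion without the event")
legend("topright", legend = names(km$strata),
       col = c("steelblue", "firebrick"), lty = 1)

# Median time to event in each stratum.
km_medians <- summary(km)$table[, "median"]
km_medians
```

In the HELP setting the event is linkage to primary care, so the vertical axis would read as the proportion of patients not yet linked.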

Our final model includes the following variables: group, alcohol, hs_grad, any_util, and age.

From this coxph model, group’s coefficient estimate is 1.73914222, which means that patients in treatment group 1 are \(e^{1.73914222} = 5.692458\) times as likely to receive primary care at any given time as patients in treatment group 0. The p-value for this coefficient is 1.621803e-14, so we reject the null hypothesis that treatment group has no effect on the ‘risk’ of seeking primary healthcare (with all else constant).

alcohol’s coefficient estimate is 0.49420991, which means that patients with alcohol as their primary substance are \(e^{0.49420991} = 1.639203\) times as likely to receive primary care at any given time as patients who do not have alcohol as their primary substance. The p-value for this coefficient is 1.527822e-02, so we reject the null hypothesis that alcohol as primary substance does not correlate with the ‘risk’ of seeking primary healthcare (with all else constant).

hs_grad’s coefficient estimate is -0.54287790, which means that patients who are high school graduates are \(e^{-0.54287790} = 0.5810736\) times as likely to receive primary care at any given time as patients who are not high school graduates. The p-value for this coefficient is 3.189917e-03, so we reject the null hypothesis that high school graduation does not correlate with the ‘risk’ of seeking primary healthcare (with all else constant).

any_util’s coefficient estimate is -0.40873422, which means that patients with recent health utilization are \(e^{-0.40873422} = 0.6644908\) times as likely to receive primary care at any given time as patients who have no recent health utilization. The p-value for this coefficient is 4.755411e-02, so we reject the null hypothesis that recent health utilization does not correlate with the ‘risk’ of seeking primary healthcare (with all else constant).

age’s coefficient estimate is 0.02386218, which means that a one-year increase in age multiplies the ‘risk’ of linking with primary care at any given time by \(e^{0.02386218} = 1.024149\). The p-value for this coefficient is 4.548318e-02, so we reject the null hypothesis that age does not correlate with the ‘risk’ of seeking primary healthcare (with all else constant).
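The hazard ratios quoted above are the exponentiated coefficients. The following sketch recomputes them from the reported estimates; the percent-change and ten-year calculations are added for illustration and are not part of the original analysis.

```r
# Coefficient estimates as reported in the text.
coefs <- c(group    =  1.73914222,
           alcohol  =  0.49420991,
           hs_grad  = -0.54287790,
           any_util = -0.40873422,
           age      =  0.02386218)

# Hazard ratio = exp(coefficient).
hazard_ratio <- exp(coefs)

# Percent change in hazard per one-unit increase in each covariate.
pct_change <- 100 * (hazard_ratio - 1)

# age is continuous, so the hazard ratio for a 10-year difference is
# exp(10 * beta), not 10 * exp(beta).
hr_age_10 <- exp(10 * coefs[["age"]])

round(cbind(hazard_ratio, pct_change), 4)
```

Under this model, a ten-year age difference corresponds to a hazard ratio of about 1.27, which is easier to interpret than the per-year value of 1.024.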

Discussion

There are factors related to perception of personal health and healthcare that affect an individual’s likelihood of linking with primary healthcare. In the model above, we show five such factors, four of which are more general (not only applicable to this study).

The significance of the group variable shows that treatment 1 increases the likelihood of a patient linking up with primary healthcare.

The significance of the alcohol variable shows that people who acknowledge their high consumption of and dependence on alcohol are more likely (1.639203 times as likely) to link up with primary healthcare, especially if they have sought help for their drinking problems before. We see from our exploratory data analysis that people with alcohol as their primary substance are more likely to view medical treatment as important (difference in proportion 95% CI: (-0.23130020, -0.01187097)).

The significance of the hs_grad variable shows that high school graduates are less likely to link up with primary healthcare (0.5810736 times as likely) than non-high school graduates. This may be because, for minor problems, high school graduates are more confident in self-diagnosing and looking for solutions themselves before seeing the need to link up with primary healthcare. From our preliminary data analysis, we found that the proportions of high school graduates and non-high school graduates who view medical treatment as important are not significantly different (difference in proportion 95% CI: (-0.09854964, 0.13218183)). Thus, we believe that a possible reason high school graduates are less likely to link up with primary healthcare is that they are less likely to seek medical treatment for less severe conditions.

The significance of the any_util variable in the model shows that patients who have recently received healthcare are less likely to link up with primary healthcare (0.6644908 times as likely). Possible reasons include the following: healthcare is expensive, so people may be less likely to make successive visits; they may have had an unpleasant visit; or a recent confirmation from a healthcare professional may have increased their confidence in their health, so that they see linking with primary healthcare as less necessary.

The significance of the age variable in the model shows that older patients are more likely to link up with primary healthcare (1.024149 times as likely for every one-year increase in age). This relates directly to our initial exploratory data analysis, where we saw that people in older age groups have more negative outlooks on their future health, which could explain their increased likelihood of linking with primary healthcare (with all else constant).

Does the author adequately relate the results of the current work to previous research? Does the author appropriately discuss to whom the results can be generalized?

Do the results answer the question?

Be clear about why we accept or reject null hypotheses

Relate work to previous research

Interesting points (potentially): Variables that are significant tend to be binary variables with an even distribution of yes/no answers (or a distribution skewed towards the yes (1) response). We suspect this is because more observations in each group translate directly to higher power, so a smaller absolute difference can still produce a low (and potentially significant) p-value.
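One way to see the effect of group balance on power: for a fixed total sample size and a fixed difference in proportions, the standard error of the difference is smallest when the two groups are the same size. The sketch below uses hypothetical numbers (400 patients in total, a baseline proportion of 0.5, and a difference of 0.1), none of which come from the study.

```r
# Hypothetical values chosen for illustration only.
n_total <- 400
p       <- 0.5
delta   <- 0.1

# Size of the smaller group: 10%, 25% and 50% of the sample.
n1 <- c(40, 100, 200)
n2 <- n_total - n1

# Standard error of a difference in proportions (both groups near p).
se <- sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

# Two-sided p-value from the normal approximation.
p_value <- 2 * pnorm(-abs(delta / se))

data.frame(n1, n2, se = round(se, 4), p_value = signif(p_value, 3))
```

With these numbers, the same absolute difference falls below 0.05 only in the evenly split sample.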

Sources/References

Perception of Health and Use of Health Care Services in a Swedish Primary Care District. A ten Year’s Perspective https://www.tandfonline.com/doi/pdf/10.3109/02813439109026592

https://stats.idre.ucla.edu/r/examples/asa/r-applied-survival-analysis-ch-6/

http://www.sthda.com/english/wiki/cox-model-assumptions